Query Answering over Functional Dependency Repairs

Author

  • Artur Galiullin
Abstract

Inconsistency often arises in real-world databases and, as a result, critical queries over dirty data may lead users to make ill-informed decisions. Functional dependencies (FDs) can be used to specify the intended semantics of the underlying data and aid with the cleaning task. Enumerating and evaluating all possible repairs to FD violations is infeasible, while approaches that produce a single repair or attempt to isolate the dirty portion of the data are often too destructive or constraining. In this thesis, we leverage a recent advance in data cleaning that allows sampling from a well-defined space of reasonable repairs, and provide the user with a data management tool that gives uncertain query answers over this space. We propose a framework to compute probabilistic query answers as though each repair sample were a possible world. We show experimentally that queries over many possible repairs produce results that are more useful than other approaches and that our system can scale to large datasets.
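
The idea of treating each sampled repair as a possible world can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `probabilistic_answers`, the toy relation with the FD zip -> city, the three hand-written repair samples, and the uniform weighting of samples are all assumptions made for illustration. The sketch runs a query over every sampled repair and reports, for each answer, the fraction of samples in which it appears.

```python
from collections import Counter

def probabilistic_answers(repairs, query):
    """Estimate answer probabilities over sampled repairs.

    Hypothetical helper: each repair is treated as one equally
    weighted possible world (an assumption, not from the source).
    """
    counts = Counter()
    n = 0
    for repair in repairs:
        n += 1
        # Count each distinct answer at most once per repair.
        counts.update(set(query(repair)))
    if n == 0:
        return {}
    return {answer: c / n for answer, c in counts.items()}

# Toy relation (name, zip, city) that should satisfy the FD zip -> city.
# The dirty data had Ann in "NYC" and Bob in "Boston" for zip 10001.
# Three illustrative repair samples (made up for this example):
samples = [
    [("Ann", "10001", "NYC"), ("Bob", "10001", "NYC")],
    [("Ann", "10001", "NYC"), ("Bob", "10001", "NYC")],
    [("Ann", "10001", "Boston"), ("Bob", "10001", "Boston")],
]

def cities_for_zip_10001(relation):
    # SELECT DISTINCT city FROM relation WHERE zip = '10001'
    return {city for (_name, zip_code, city) in relation if zip_code == "10001"}

result = probabilistic_answers(samples, cities_for_zip_10001)
for city, prob in sorted(result.items()):
    print(f"{city}: {prob:.2f}")
```

Under these assumptions, an answer that appears in more sampled repairs receives a proportionally higher estimated probability, so the user sees both candidate cities instead of a single, possibly wrong, repaired value.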

Similar resources

A Formal Framework For Probabilistic Unclean Databases

Traditional modeling of inconsistency in database theory casts all possible “repairs” equally likely. Yet, effective data cleaning needs to incorporate statistical reasoning. For example, yearly salary of $100k and age of 22 are more likely than $100k and 122 and two people with same address are likely to share their last name (i.e., a functional dependency tends to hold but may occasionally be...

Some Research Directions in Consistent Query Answering: A Vision

Research in consistent query answering (CQA) in databases was initiated in the database community with the publication of [1], where the main goal was to formalize the notion of consistent answer to a query posed to a possibly inconsistent database, i.e. that fails to satisfy a given set of integrity constraints (ICs) that are not enforced by the system. For many reasons [9], such inconsistenci...

On the finite controllability of conjunctive query answering in databases under open-world assumption

In this paper we study queries over relational databases with integrity constraints (ICs). The main problem we analyze is OWA query answering, i.e., query answering over a database with ICs under open-world assumption. The kinds of ICs that we consider are inclusion dependencies and functional dependencies, in particular key dependencies; the query languages we consider are conjunctive queries ...

Exchange-Repairs: Managing Inconsistency in Data Exchange

In a data exchange setting with target constraints, it is often the case that a given source instance has no solutions. Intuitively, this happens when data sources contain inconsistent or conflicting information that is exposed by the target constraints at hand. In such cases, the semantics of target queries trivialize, because the certain answers of every target query over the given source ins...

Consistent Query Answers in the Presence of Universal Constraints

The framework of consistent query answers and repairs has been introduced to alleviate the impact of inconsistent data on the answers to a query. A repair is a minimally different consistent instance and an answer is consistent if it is present in every repair. In this article we study the complexity of consistent query answers and repair checking in the presence of universal constraints. We pr...

Publication date: 2013